
Conversation

GNroy (Owner) commented Feb 15, 2023

What does this PR do ?

Add a one-line overview of what this PR aims to accomplish.

Collection: [Note which collection this PR will affect]

Changelog

  • Add specific line-by-line info of the high-level changes in this PR.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?
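The import-guard item in the checklist can be sketched as follows. This is a minimal, hypothetical illustration of the guard pattern, not NeMo's actual guard utilities; `numba` is just one of the optional libraries the checklist names, and `fast_sum` is an invented example function:

```python
# Sketch of an import guard for an optional dependency.
# NUMBA_AVAILABLE and fast_sum are illustrative names, not NeMo APIs.
try:
    import numba  # optional dependency; may be absent

    NUMBA_AVAILABLE = True
except ImportError:
    NUMBA_AVAILABLE = False


def fast_sum(xs):
    """Sum a list, dispatching to the optional compiled path only when available."""
    if NUMBA_AVAILABLE:
        # A real implementation would call a numba-compiled kernel here;
        # the sketch keeps a single pure-Python path for clarity.
        pass
    return sum(xs)
```

The point the reviewer checks for is that importing the module never fails when the optional library is missing; only the code paths that need it are gated.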

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

github-advanced-security (bot) left a comment


CodeQL found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
super().__init__(num_classes, blank, topo_type, topo_with_self_loops, device, aux_graph)
if aux_graph is None:
    self.den_graph = k2.create_fsa_vec([self.ctc_topo_inv.invert()]).to(self.device)
    self.decoding_graph = k2.create_fsa_vec([self.ctc_topo_inv.invert()]).to(self.device)

Check warning (Code scanning / CodeQL): Overwriting attribute in super-class or sub-class
Assignment overwrites attribute decoding_graph, which was previously defined in superclass CtcNumGraphCompiler.
    self.decoding_graph = k2.create_fsa_vec([self.ctc_topo_inv.invert()]).to(self.device)
else:
    self.den_graph = k2.create_fsa_vec([self.base_graph.detach()]).to(self.device)
    self.decoding_graph = k2.create_fsa_vec([self.base_graph.detach()]).to(self.device)

Check warning (Code scanning / CodeQL): Overwriting attribute in super-class or sub-class
Assignment overwrites attribute decoding_graph, which was previously defined in superclass CtcNumGraphCompiler.
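CodeQL raises this warning because both `__init__` methods assign the same attribute, so the superclass's assignment is immediately discarded. A minimal sketch of the flagged pattern and one common remedy (an overridable builder hook); the class and attribute names below are hypothetical and only mirror the ones in the diff:

```python
# Sketch of the "overwriting attribute" pattern and a hook-based fix.
class Base:
    def __init__(self):
        # The superclass builds the attribute through a hook method.
        self.decoding_graph = self._build_decoding_graph()

    def _build_decoding_graph(self):
        return "base-graph"


class BadSub(Base):
    def __init__(self):
        super().__init__()                  # sets decoding_graph ...
        self.decoding_graph = "sub-graph"   # ... then overwrites it (flagged)


class GoodSub(Base):
    # Overriding the builder hook avoids the double assignment entirely.
    def _build_decoding_graph(self):
        return "sub-graph"
```

Both subclasses end up with the same value, but `GoodSub` builds it once, which is what the CodeQL query is nudging toward.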
Comment on lines +196 to +198
self.graph_compiler = CtcTopologyCompiler(
    self.num_classes, self.blank, self.topo_type, self.topo_with_self_loops, self.device
)

Check warning (Code scanning / CodeQL): Overwriting attribute in super-class or sub-class
Assignment overwrites attribute graph_compiler, which was previously defined in superclass BaseDecoder.
self.graph_compiler = CtcTopologyCompiler(
    self.num_classes, self.blank, self.topo_type, self.topo_with_self_loops, self.device
)
self.base_graph = k2.create_fsa_vec([self.graph_compiler.ctc_topo_inv.invert()]).to(self.device)

Check warning (Code scanning / CodeQL): Overwriting attribute in super-class or sub-class
Assignment overwrites attribute base_graph, which was previously defined in superclass BaseDecoder.
Comment on lines +237 to +244
self.graph_compiler = RnntTopologyCompiler(
    self.num_classes,
    self.blank,
    self.topo_type,
    self.topo_with_self_loops,
    self.device,
    max_adapter_length=self.predictor_window_size,
)

Check warning (Code scanning / CodeQL): Overwriting attribute in super-class or sub-class
Assignment overwrites attribute graph_compiler, which was previously defined in superclass BaseDecoder.
    self.device,
    max_adapter_length=self.predictor_window_size,
)
self.base_graph = self.graph_compiler.base_graph

Check warning (Code scanning / CodeQL): Overwriting attribute in super-class or sub-class
Assignment overwrites attribute base_graph, which was previously defined in superclass BaseDecoder.
if backend == "k2":
    if self.dec_type == "topo":
        from nemo.collections.asr.parts.k2.graph_decoders import BaseDecoder as Decoder
        from nemo.collections.asr.parts.k2.graph_decoders import CtcDecoder as Decoder

Check notice (Code scanning / CodeQL): Cyclic import
Import of module nemo.collections.asr.parts.k2.graph_decoders begins an import cycle.
    from nemo.collections.asr.parts.k2.graph_decoders import BaseDecoder as Decoder
    from nemo.collections.asr.parts.k2.graph_decoders import CtcDecoder as Decoder
elif self.dec_type == "topo_rnnt_ali":
    from nemo.collections.asr.parts.k2.graph_decoders import RnntAligner as Decoder

Check notice (Code scanning / CodeQL): Cyclic import
Import of module nemo.collections.asr.parts.k2.graph_decoders begins an import cycle.
elif criterion_type == "map":
    from nemo.collections.asr.parts.k2.map_loss import MAPLoss as K2Loss
    if loss_type == "ctc":
        from nemo.collections.asr.parts.k2.map_loss import CtcMmiLoss as K2Loss

Check notice (Code scanning / CodeQL): Cyclic import
Import of module nemo.collections.asr.parts.k2.map_loss begins an import cycle.
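These function-level imports are in fact the standard way to break such a cycle: deferring the import to call time means it runs only after both modules have finished initializing, which usually downgrades the cyclic-import notice to a non-issue. A hedged sketch of the same shape, with `json` standing in for the k2 modules since the real packages are not assumed here:

```python
# Sketch of a deferred (call-time) import breaking a module-level cycle.
# json.dumps/json.loads stand in for the decoder classes in the diff.
def build_decoder(dec_type: str):
    if dec_type == "topo":
        from json import dumps as Decoder  # deferred, cycle-safe import
    elif dec_type == "topo_rnnt_ali":
        from json import loads as Decoder
    else:
        raise ValueError(f"unsupported dec_type: {dec_type}")
    return Decoder
```

The trade-off is that the import error (if any) surfaces at first call rather than at module load, which is why CodeQL still leaves a notice rather than staying silent.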
GNroy and others added 2 commits February 15, 2023 10:01
Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Comment on lines 205 to 210
def _intersect_calc_scores(
    self,
    log_probs: torch.Tensor,
    targets: torch.Tensor,
    input_lengths: torch.Tensor,
    target_lengths: torch.Tensor,
) -> torch.Tensor:
    assert self.graph_compiler is not None
    boosted = self.boost_coeff != 0.0
    if self.blank != 0:
        # rearrange log_probs to put blank at the first place
        # and shift targets to emulate blank = 0
        log_probs, targets = make_blank_first(self.blank, log_probs, targets)
    supervisions, order = create_supervision(input_lengths)
    order = order.long()
    targets = targets[order]
    target_lengths = target_lengths[order]

    if log_probs.device != self.graph_compiler.device:
        self.graph_compiler.to(log_probs.device)

    num_graphs, den_graph = self.graph_compiler.compile(
        targets + 1 if self.pad_fsavec else targets, target_lengths
    )

    emissions_graphs: 'k2.DenseFsaVec',
    supervision_graphs: Tuple['k2.Fsa', 'k2.Fsa'],
    supervisions: torch.Tensor,
) -> Tuple[torch.Tensor, torch.Tensor]:

Check warning (Code scanning / CodeQL): Signature mismatch in overriding method
Overriding method '_intersect_calc_scores' has signature mismatch with the overridden method.
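The warning fires because the override takes different parameters than the base-class method, so any caller that dispatches through the base-class interface breaks. A minimal, invented sketch of the problem and the usual fix (keep the signature, or rename the subclass method); the names below only echo the diff:

```python
# Sketch of a signature mismatch in an overriding method.
class BaseLoss:
    def _intersect_calc_scores(self, log_probs, targets):
        return len(targets)


class BadLoss(BaseLoss):
    # Different arity: code holding a BaseLoss reference calls this with
    # two arguments and fails with a TypeError at runtime.
    def _intersect_calc_scores(self, emissions_graphs, supervision_graphs, supervisions):
        return len(supervisions)


class GoodLoss(BaseLoss):
    # Keeping the base signature (or renaming the method) preserves the contract.
    def _intersect_calc_scores(self, log_probs, targets):
        return 2 * len(targets)
```

If the subclass genuinely needs a different interface, renaming the method rather than overriding is what clears both the CodeQL warning and the latent runtime bug.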
GNroy and others added 5 commits February 15, 2023 10:53
Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
"""

import textwrap
from typing import Tuple

Check notice (Code scanning / CodeQL): Unused import
Import of 'Tuple' is not used.
# TODO @xueyang: deprecate this file since no other places import modules from here anymore. However,
# all checkpoints uploaded to NGC used this path, so all NGC checkpoint g2p paths need to be updated as well.
from nemo_text_processing.g2p.modules import IPAG2P, BaseG2p, EnglishG2p
from nemo.collections.tts.g2p.modules import IPAG2P, BaseG2p, EnglishG2p

Check notice (Code scanning / CodeQL): Unused import
Import of 'IPAG2P' is not used. Import of 'BaseG2p' is not used. Import of 'EnglishG2p' is not used.
GNroy pushed a commit that referenced this pull request Jun 4, 2023
* cache-aware streaming export

Test onnx streaming conformer ctc WER

Constant att cache width with len param

Remove some extra functions in cache_aware runner

transpose cache so that batch is first for trt

Signed-off-by: Greg Clark <grclark@nvidia.com>

* fix export for full-context conformer

* WIP trying to improve onnx perf

Signed-off-by: Greg Clark <grclark@nvidia.com>

* Adding test scripts

Signed-off-by: Greg Clark <grclark@nvidia.com>

* More perf testing script

Signed-off-by: Greg Clark <grclark@nvidia.com>

* Updates for jit torch_tensorrt tracing

Signed-off-by: Greg Clark <grclark@nvidia.com>

* Fixed trace warnings

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Rearranging tests

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Fixing non-caching case

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* testing

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Fixed channel cache length issue

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* stash

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Reverting non-essential changes

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Offset=None case

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Remove test scripts

Signed-off-by: Greg Clark <grclark@nvidia.com>

* Clean up speech_to_text_cache_aware_streaming_infer

Signed-off-by: Greg Clark <grclark@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert pad -> constant_pad_nd

Signed-off-by: Greg Clark <grclark@nvidia.com>

* conformer-encoder set window_size from streaming_cfg

Signed-off-by: Greg Clark <grclark@nvidia.com>

* Fixes for working export(), using more constants

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Optional rand init for cache

Signed-off-by: Greg Clark <grclark@nvidia.com>

* Folding update_cache with constants

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* More folding

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Reducing diff #1

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Reducing diff #2

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Reducing diff #3

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Fixed unit tests, more reverts

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Export fixes

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Reverted slice changes that ruined ONNX perf

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Adding back keep_all_outputs and drop_extra_preencoded

Signed-off-by: Greg Clark <grclark@nvidia.com>

* Fix export

Signed-off-by: Greg Clark <grclark@nvidia.com>

---------

Signed-off-by: Greg Clark <grclark@nvidia.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
GNroy pushed a commit that referenced this pull request Dec 3, 2024
* upcycle dense to moe

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix(?) path when saving

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* bot happy

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* bot happy #2

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add unwrap method

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* move file

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>
GNroy pushed a commit that referenced this pull request Dec 13, 2024
…#11500)

* Make HfDatasetDataModule a datasets.load_dataset wrapper

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add logging

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Update HFDatasetDataModule

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor fixup

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor fixup #2

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* do not expand

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* doc

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* doc

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add synonym

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* typo

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* Add train/val/test attributes

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add test for hf-datamodule

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Import lazily to avoid breaking with older megatron versions

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* bot happy

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* bot happy2

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add doc-strings and collate-fn arg

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>